Abstract
We obtain a lower bound of $\Omega\left(\frac{n^{1/(k+1)}}{2^{2^k}(k-1)2^{k-1}}\right)$ on the $k$-party randomized communication
complexity of the Disjointness function in the 'Number on the Forehead' model of multiparty
communication. In particular, this yields a bound of $n^{\Omega(1)}$ when $k$ is a constant. The previous
best lower bound for three players until recently was $\Omega(\log n)$.

Our bound separates the communication complexity classes $\mathrm{NP}_k^{CC}$ and $\mathrm{BPP}_k^{CC}$ for
$k = o(\log\log n)$. Furthermore, by the results of Beame, Pitassi and Segerlind [4], our bound implies
proof size lower bounds for tree-like, degree $k-1$ threshold systems and superpolynomial size
lower bounds for Lovász-Schrijver proofs.

Sherstov [16] recently developed a novel technique to obtain lower bounds on two-party
communication using the approximate polynomial degree of boolean functions. We obtain our
results by extending his technique to the multi-party setting using ideas from Chattopadhyay [8].

A similar bound for Disjointness has been recently and independently obtained by Lee and
Shraibman.



1 Introduction 



Chandra, Furst and Lipton [7] introduced the 'Number on the Forehead' model of multiparty 
communication as an extension of Yao's [20] two party communication model. This model, besides 
being interesting in its own right, has found numerous connections with circuit complexity, proof 
complexity, branching programs, pseudo-random generators and other areas of theoretical computer 
science. 

Proving both upper and lower bounds in this model remains very challenging, as it is
known that the overlap of information accessible to the players gives the model significant power.
In fact, proving a super-polylogarithmic lower bound on the communication needed by a poly-logarithmic
number of players to compute a function $f$, even in the restricted setting of simultaneous deterministic
communication, is enough to show that $f$ is not in $\mathrm{ACC}^0$, a class for which no strong bounds are
known. Although several efforts [2], [9], [11], [10] have been made, this goal currently remains out of
reach, as no superlogarithmic lower bounds exist even for $\log n$ players.

More modestly, one would like to determine the communication complexity of simple
functions for at least a constant number of players. However, despite intensive research (see for
example [3, El, 19, 18]), the best known lower bounds on the communication complexity of simple
functions like Disjointness and Pointer Jumping were $\Omega(\log n)$ even for three players. The root cause
of this problem is that essentially only one method formed the backbone of almost all
strong lower bounds. This method is known as the discrepancy method and was introduced in
the seminal work of Babai, Nisan and Szegedy [2]. It is known, however, that for functions like
Disjointness this method yields at best $\Omega(\log n)$ lower bounds.

Razborov [15] introduced the multi-dimensional discrepancy method to establish a tight
relationship between the quantum communication complexity of functions induced by a symmetric
base function and the approximation degree of the base function. Recently, Sherstov [16]
developed an elegant technique that is simpler and generalizes the results of Razborov by obviating the
need for the base function to be symmetric. More importantly for us, the technique in [16] shows
that the classical discrepancy method can be modified in a natural way that allows one to obtain
strong bounds on two-party quantum communication with bounded error, even for functions like
Disjointness that have large discrepancy. In this work, we suitably modify this technique to extend
it to the multi-party setting. To achieve this, we use tools developed in Chattopadhyay [8],
extending the earlier work of Sherstov [17], for estimating discrepancy under certain non-uniform
distributions.

Our result has interesting consequences for communication complexity classes and proof
complexity. It provides the first example of an explicit function that has small non-deterministic
communication complexity, but exponentially high randomized complexity. In the language of
complexity classes, this separates $\mathrm{BPP}_k^{CC}$ and $\mathrm{NP}_k^{CC}$ for $k = o(\log\log n)$. In fact, the separation
is exponential when $k$ is any constant. Although such a separation was already known from the
work of [3], before our work no explicit function was known to separate these classes. By the work
of Beame, Pitassi and Segerlind [4], our lower bounds on the $k$-party complexity of Disjointness
imply strong lower bounds on the proof size for a family of proof systems known as tree-like,
degree $k-1$ threshold systems. Proving lower bounds for these systems was a major open problem
in propositional proof complexity.

1.1 Our Main Result 

Let $y^1,\ldots,y^{k-1}$ be $k-1$ $n$-bit binary strings. Define the $(k-1) \times n$ boolean matrix $A$ obtained
by placing $y^i$ in the $i$th row of $A$. For $x \in \{0,1\}^n$, let $x \Leftarrow y^1,\ldots,y^{k-1}$ be the $n$-bit string
$x_{i_1} x_{i_2} \cdots x_{i_t} 0^{n-t}$, where $i_1,\ldots,i_t$ are the indices of the all-one columns of $A$.

Let $g : \{0,1\}^n \to \{-1,1\}$ be a base function. We define $G_k^g : (\{0,1\}^n)^k \to \{-1,1\}$ by
$G_k^g(x, y^1,\ldots,y^{k-1}) := g(x \Leftarrow y^1,\ldots,y^{k-1})$. Observe that $G_k^{\mathrm{PARITY}}$ is the Generalized Inner
Product function and $G_k^{\mathrm{NOR}}$ is the Disjointness function. Our main result shows how to use the high
approximation degree of a base function to generate a function with high randomized
communication complexity.
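The selection operation and the induced function can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`select_all_one_columns`, `G`) are hypothetical, bit strings are plain lists, and the value $-1$ is assumed to encode "true", so that NOR evaluates to $-1$ exactly on the all-zero string.

```python
from typing import Callable, List, Sequence

Bits = Sequence[int]

def select_all_one_columns(x: Bits, ys: Sequence[Bits]) -> List[int]:
    """Return the n-bit string x <= y^1,...,y^{k-1}.

    The rows ys form the (k-1) x n matrix A. The bits of x sitting at the
    all-one columns of A are kept in order and padded with zeros to length n.
    """
    n = len(x)
    columns = [i for i in range(n) if all(y[i] == 1 for y in ys)]
    picked = [x[i] for i in columns]
    return picked + [0] * (n - len(picked))

def parity(z: Bits) -> int:
    """PARITY as a {-1, 1}-valued base function."""
    return -1 if sum(z) % 2 else 1

def nor(z: Bits) -> int:
    """NOR under the assumed convention that -1 encodes 'true'."""
    return -1 if not any(z) else 1

def G(g: Callable[[Bits], int], x: Bits, ys: Sequence[Bits]) -> int:
    """Evaluate G_k^g(x, y^1, ..., y^{k-1}) = g(x <= y^1, ..., y^{k-1})."""
    return g(select_all_one_columns(x, ys))
```

With `nor` as the base function, `G` returns $-1$ precisely when no column is all-one across `x` and the rows `ys`, which is the disjointness condition under the assumed sign convention.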

Let $R_k^{\epsilon}(f)$ denote the randomized $k$-party communication complexity of $f$ with advantage $\epsilon$.
Then,

Theorem 1.1. Let $f : \{0,1\}^m \to \{-1,1\}$ have $\delta$-approximate degree $d$. Let
$n \ge \left(\frac{2^{2^k}(k-1)e}{d}\right)^{k-1} m^k$, and $f' : \{0,1\}^n \to \{-1,1\}$ be such that $f(z) = f'(z0^{n-m})$. Then

$$R_k^{\epsilon}(G_k^{f'}) \ge \frac{d}{2^{k-1}} + \log(\delta + 2\epsilon - 1).$$






As a corollary we show that

$$R_k^{\epsilon}(\mathrm{DISJ}_k) = \Omega\left(\frac{n^{1/(k+1)}}{2^{2^k}(k-1)2^{k-1}}\right)$$

for every constant $\epsilon > 0$. In brief, this follows from the following facts. Let $\mathrm{NOR}_n$ denote the NOR
function for inputs of length $n$. Then $f' = \mathrm{NOR}_n$ and $f = \mathrm{NOR}_m$ satisfy $f(z) = f'(z0^{n-m})$ and by
a result of Paturi [13], we know that the 1/3-approximate degree of $\mathrm{NOR}_m$ is $\Theta(\sqrt{m})$.

A similar bound for the Disjointness function has been recently and independently obtained by
Lee and Shraibman [12].

1.2 Proof Overview 

Sherstov [16] devised a novel strategy to make a passage from approximation degree of boolean
functions to lower bounds on two-party communication complexity. We adapt this strategy for our
purpose. This adaptation is outlined in Figure 1.

We use three main ingredients, the first of which is the Generalized Discrepancy Method.
The classical discrepancy method states that if a function has low discrepancy, then it has high
randomized communication complexity. In the generalized discrepancy method this idea is extended
as follows: if a function $g$ correlates well with $f$ and has low discrepancy, then $f$ has high randomized
communication complexity.

The second ingredient is the "Approximation/Orthogonality Principle" of Sherstov [16]. It
states that given a function $f$ with high approximation degree, we can find a function $g$ that
correlates well with $f$, and a distribution $\mu$ such that $g$ is orthogonal to every low degree polynomial
under $\mu$.

The third ingredient, called the Orthogonality-Discrepancy Lemma, is derived from the work
of Chattopadhyay [8]. This takes a function that is orthogonal to low degree polynomials and
constructs a new masked function that has low discrepancy.

We can then summarize the strategy as follows. We start with a function $f : \{0,1\}^m \to \{-1,1\}$
with high approximation degree. By the Approximation/Orthogonality Principle, we obtain $g$ that
highly correlates with $f$ and is orthogonal to low degree polynomials. From $f$ and $g$ we construct
new masked functions $F_k^f$ and $F_k^g$, similar to the construction of $G_k^f$. Since $g$ is orthogonal to
low degree polynomials, by the Orthogonality-Discrepancy Lemma we deduce that $F_k^g$ has low
discrepancy under an appropriate distribution. Under this distribution $F_k^g$ and $F_k^f$ are highly
correlated and therefore, applying the Generalized Discrepancy Method, we conclude that $F_k^f$ has
high randomized communication complexity. This implies, by the construction of $F_k^f$, that the
randomized communication complexity of $G_k^f$ is high.

2 Preliminaries 

2.1 Multiparty Communication Model 

In the multiparty communication model introduced by [7], $k$ players $P_1,\ldots,P_k$ wish to
collaborate to compute a function $f : \{0,1\}^n \to \{-1,1\}$. The $n$ input bits are partitioned into $k$ sets
$X_1,\ldots,X_k \subseteq [n]$ and each participant $P_i$ knows the values of all the input bits except the ones
of $X_i$. This game is often referred to as the "Number/Input on the forehead" model since it is
convenient to picture that player $i$ has the bits of $X_i$ written on its forehead, available to everyone
but itself. Players exchange bits, according to an agreed upon protocol, by writing them on a public
blackboard. The protocol specifies whose turn it is to speak, and what the player broadcasts as
a function of the communication history and the input the player has access to. The protocol's
output is a function of what is on the blackboard after the protocol's termination. We denote by
$D_k(f)$ the deterministic $k$-party communication complexity of $f$, i.e. the number of bits exchanged
in the best deterministic protocol for $f$ on the worst case input.

Figure 1: Proof outline

By allowing the players to access a public random string and the protocol to err, one defines
the randomized communication complexity of a function. We say that a protocol $\mathcal{P}$ computes $f$ with
$\epsilon$ advantage if the probability that $\mathcal{P}$ and $f$ agree is at least $1/2 + \epsilon$ for all inputs. We denote
by $R_k^{\epsilon}(f)$ the cost of the best protocol that computes $f$ with advantage $\epsilon$. One further introduces
non-determinism in protocols by allowing 'God' to help the players by furnishing a proof string.
As is usual with non-determinism in other models, a correct non-deterministic protocol $\mathcal{P}$ for $f$ has
the following property: on every input $x$ at which $f(x) = -1$, $\mathcal{P}(x,y) = -1$ for some proof string
$y$, and whenever $f(x) = 1$, $\mathcal{P}(x,y) = 1$ for all proof strings $y$. The length of the proof string $y$ is
now included in the cost of $\mathcal{P}$ on an input, and $N_k(f)$ denotes the cost of the best non-deterministic
protocol for $f$ on the worst input.

Communication complexity classes were introduced for two players in [1], in which an "efficient"
protocol was defined to have cost no more than $\mathrm{polylog}(n)$. This idea naturally extends to
the multiparty model, giving rise to the following classes: $\mathrm{P}_k^{CC} := \{f \mid D_k(f) = \mathrm{polylog}(n)\}$,
$\mathrm{BPP}_k^{CC} := \{f \mid R_k^{1/3}(f) = \mathrm{polylog}(n)\}$ and $\mathrm{NP}_k^{CC} := \{f \mid N_k(f) = \mathrm{polylog}(n)\}$. Determining the
relationship among these classes is an interesting research theme within the broader area of
understanding the relative power of determinism, non-determinism and randomness in computation.
While Beame et al. [3] show that $\mathrm{BPP}_k^{CC} \neq \mathrm{NP}_k^{CC}$, no explicit function was known that separated
these classes.

2.2 Cylinder Intersections and Discrepancy 

The key combinatorial object that arises in the study of multiparty communication is a
cylinder-intersection. A $k$-cylinder in the $i$th dimension is a subset $S$ of $Y_1 \times \cdots \times Y_k$ with the property that
membership in $S$ is independent of the $i$th coordinate. A set $S$ is called a cylinder-intersection if
$S = \cap_{i=1}^{k} S_i$, where $S_i$ is a cylinder in the $i$th dimension. One can represent a $k$-cylinder in the $i$th
dimension by its characteristic function $\phi^i : (\{0,1\}^n)^k \to \{0,1\}$. Here $\phi^i(y_1,\ldots,y_k)$ does not depend
on $y_i$. A cylinder intersection is represented as the product

$$\phi(y_1,\ldots,y_k) = \phi^1(y_1,\ldots,y_k) \cdots \phi^k(y_1,\ldots,y_k).$$

It is well known that a protocol that computes $f$ with cost $c$ partitions the input space of $f$
into at most $2^c$ monochromatic cylinder intersections.

An important measure, defined for a function $f : Y_1 \times \cdots \times Y_k \to \{-1,1\}$, is its discrepancy.
With respect to any probability distribution $\mu$ over $Y_1 \times \cdots \times Y_k$ and cylinder intersection $\phi$, define

$$\mathrm{disc}_{k,\mu}^{\phi}(f) = \Big| \Pr_{\mu}\big[f(y_1,\ldots,y_k) = 1 \wedge \phi(y_1,\ldots,y_k) = 1\big] - \Pr_{\mu}\big[f(y_1,\ldots,y_k) = -1 \wedge \phi(y_1,\ldots,y_k) = 1\big] \Big|.$$

Since $f$ is $-1/1$ valued, it is not hard to verify that equivalently:

$$\mathrm{disc}_{k,\mu}^{\phi}(f) = \Big| \mathbb{E}_{(y_1,\ldots,y_k) \sim \mu}\, f(y_1,\ldots,y_k)\,\phi(y_1,\ldots,y_k) \Big| \tag{1}$$

The discrepancy of $f$ w.r.t. $\mu$, denoted by $\mathrm{disc}_{k,\mu}(f)$, is $\max_{\phi} \mathrm{disc}_{k,\mu}^{\phi}(f)$. To reduce notational
clutter, we often drop $\mu$ from the subscript when the distribution is clear from the context. We now
state the discrepancy method, which connects the discrepancy and the randomized communication
complexity of a function.
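For intuition, discrepancy can be computed by brute force in the smallest case $k = 2$, where a cylinder intersection is simply a rectangle $A \times B$. The following Python sketch is illustrative only and makes several assumptions: the function names are hypothetical, the distribution is a dictionary from input pairs to exact `Fraction` weights, and the exhaustive search over all rectangles is feasible only for tiny input sets.

```python
from fractions import Fraction
from itertools import product

def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for mask in range(1 << len(items)):
        yield frozenset(items[i] for i in range(len(items)) if (mask >> i) & 1)

def disc_probability_form(f, mu, A, B):
    """|Pr[f = 1 and (a, b) in A x B] - Pr[f = -1 and (a, b) in A x B]| under mu."""
    plus = sum(p for (a, b), p in mu.items() if a in A and b in B and f(a, b) == 1)
    minus = sum(p for (a, b), p in mu.items() if a in A and b in B and f(a, b) == -1)
    return abs(plus - minus)

def disc_expectation_form(f, mu, A, B):
    """|E_mu[f * phi]| where phi is the 0/1 indicator of the rectangle A x B."""
    return abs(sum(p * f(a, b) for (a, b), p in mu.items() if a in A and b in B))

def discrepancy(f, mu, Y1, Y2):
    """Maximum of the expectation form over all rectangles (feasible only for tiny inputs)."""
    return max(disc_expectation_form(f, mu, A, B)
               for A in subsets(Y1) for B in subsets(Y2))

# Example: one-bit inner product under the uniform distribution.
Y1 = Y2 = (0, 1)
uniform = {(a, b): Fraction(1, 4) for a, b in product(Y1, Y2)}
inner_product = lambda a, b: -1 if a & b else 1
```

In this toy example the three inputs on which `inner_product` equals 1 cannot be isolated by a rectangle, so the maximum is attained by rectangles such as the full input set.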

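Theorem 2.1 (see [2], [11]). Let $0 < \epsilon \le 1/2$ be any real and $k \ge 2$ be any integer. For every
function $f : Y_1 \times \cdots \times Y_k \to \{1,-1\}$ and distribution $\mu$ on inputs from $Y_1 \times \cdots \times Y_k$,

$$R_k^{\epsilon}(f) \ge \log\left(\frac{2\epsilon}{\mathrm{disc}_{k,\mu}(f)}\right) \tag{2}$$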



2.3 Fourier Expansion 

We consider the vector space of functions from $\{0,1\}^n \to \mathbb{R}$. Equip this space with the standard
inner product

$$\langle f, g \rangle = \mathbb{E}_{x \sim U}\, f(x) g(x) \tag{3}$$

where $U$ is the uniform distribution on $\{0,1\}^n$. For each $S \subseteq [n]$, define $\chi_S(x) = (-1)^{\sum_{i \in S} x_i}$.
Then it is easy to verify that the set of functions
$\{\chi_S \mid S \subseteq [n]\}$ forms an orthonormal basis for this inner product space, and so every $f$ can be
expanded in terms of its Fourier coefficients

$$f(x) = \sum_{S \subseteq [n]} \hat{f}(S) \chi_S(x) \tag{4}$$

where $\hat{f}(S)$ is defined as $\langle f, \chi_S \rangle$. This expansion is unique and the exact degree of $f$ is defined to
be the largest $d$ such that there exists $S \subseteq [n]$ with $|S| = d$ and $\hat{f}(S) \neq 0$.
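The expansion can be checked directly on small examples. The sketch below is a brute-force illustration under simple assumptions: the helper names are hypothetical, inputs are tuples of bits, subsets $S$ are tuples of 0-based indices, and exact arithmetic is done with `Fraction`.

```python
from fractions import Fraction
from itertools import combinations, product

def chi(S, x):
    """Character chi_S(x) = (-1)^(sum of x_i for i in S)."""
    return -1 if sum(x[i] for i in S) % 2 else 1

def fourier_coefficients(f, n):
    """Map each subset S (a tuple of indices) to <f, chi_S> under the uniform distribution."""
    cube = list(product((0, 1), repeat=n))
    return {
        S: Fraction(sum(f(x) * chi(S, x) for x in cube), len(cube))
        for size in range(n + 1)
        for S in combinations(range(n), size)
    }

def evaluate(coeffs, x):
    """Evaluate the Fourier expansion at the point x."""
    return sum(c * chi(S, x) for S, c in coeffs.items())

def exact_degree(coeffs):
    """Largest |S| carrying a non-zero Fourier coefficient."""
    return max((len(S) for S, c in coeffs.items() if c != 0), default=0)

# Example: a {-1, 1}-valued AND on two bits, taking the value -1 only at (1, 1).
and2 = lambda x: -1 if all(x) else 1
and2_coeffs = fourier_coefficients(and2, 2)
```

For `and2` the coefficients are $1/2, 1/2, 1/2, -1/2$ on $\emptyset, \{0\}, \{1\}, \{0,1\}$ respectively, so the exact degree is 2.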
2.4 Approximation Degree

A natural question is the following. How large a degree is needed if we simply want to approximate
$f$ well? Define the $\epsilon$-approximate degree of $f$, denoted by $\deg_{\epsilon}(f)$, to be the smallest integer $d$ for
which there exists a function $\phi$ of exact degree $d$ such that

$$\max_{x \in \{0,1\}^n} \big| f(x) - \phi(x) \big| \le \epsilon.$$

For any $D : \{0,1,\ldots,n\} \to \{1,-1\}$, define

$$\ell_0(D) \in \{0,1,\ldots,\lfloor n/2 \rfloor\}$$

$$\ell_1(D) \in \{0,1,\ldots,\lceil n/2 \rceil\}$$

such that $D$ is constant over the interval $[\ell_0(D), n - \ell_1(D)]$ and $\ell_0(D)$ and $\ell_1(D)$ are the smallest
possible values for which this happens.

Paturi's theorem provides bounds on the approximate degree of symmetric functions. 

Theorem 2.2 (Paturi [13]). Let $f : \{0,1\}^n \to \{1,-1\}$ be any symmetric function induced from the
predicate $D : \{0,\ldots,n\} \to \{1,-1\}$. Then,

$$\deg_{1/3}(f) = \Theta\left(\sqrt{n(\ell_0(D) + \ell_1(D))}\right) \tag{5}$$

In particular, the 1/3-approximate degree of NOR is $\Theta(\sqrt{n})$.
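The quantities $\ell_0(D)$ and $\ell_1(D)$ are easy to compute for a concrete predicate. The Python sketch below is one possible reading of the definition, stated as an assumption: among all admissible pairs it picks the one with the smallest sum $\ell_0 + \ell_1$, which is the quantity entering (5). The function names are hypothetical, and the predicate is passed as a list `D` of length $n+1$ with entries in $\{1,-1\}$.

```python
import math

def ell_parameters(D):
    """Return (l0, l1) such that D is constant on [l0, n - l1].

    Assumption: 'smallest possible values' is read as minimising l0 + l1
    over l0 in {0..floor(n/2)} and l1 in {0..ceil(n/2)}.
    """
    n = len(D) - 1
    best = None
    for a in range(n // 2 + 1):            # candidates for l0
        for b in range((n + 1) // 2 + 1):  # candidates for l1
            if a <= n - b and len(set(D[a:n - b + 1])) == 1:
                if best is None or a + b < best[0] + best[1]:
                    best = (a, b)
    return best

def paturi_quantity(D):
    """sqrt(n * (l0 + l1)), the expression inside the Theta of Paturi's theorem."""
    n = len(D) - 1
    l0, l1 = ell_parameters(D)
    return math.sqrt(n * (l0 + l1))

# Predicate of NOR on 8 bits (assumed convention: -1 encodes 'true').
nor8 = [-1] + [1] * 8
# Predicate of PARITY on 4 bits.
parity4 = [1, -1, 1, -1, 1]
```

For `nor8` the predicate changes value only between weights 0 and 1, giving $(\ell_0, \ell_1) = (1, 0)$ and the familiar $\sqrt{n}$ behaviour; for `parity4` the predicate is nowhere constant on an interval of length two, giving $(2, 2)$ and a quantity equal to $n$.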

3 The Generalized Discrepancy Method 

Babai, Nisan and Szegedy [2] estimated the discrepancy of functions like $\mathrm{GIP}_k$ w.r.t. $k$-wise cylinder
intersections and the uniform distribution. These estimates resulted in the first strong lower bounds
in the $k$-party model via Theorem 2.1. Unfortunately, the applicability of Theorem 2.1 is limited
to those functions that have small discrepancy. Disjointness is a classical example of a function that
does not have small discrepancy.

Lemma 3.1 (Folklore). Under every distribution $\mu$ over the inputs,

$$\mathrm{disc}_{k,\mu}(\mathrm{DISJ}_k) = \Omega(1/n).$$

Proof. Let $X^+$ and $X^-$ be the set of disjoint and non-disjoint inputs respectively. The first thing
to observe is that if $|\mu(X^+) - \mu(X^-)| = \Omega(1/n)$, then we are done immediately by considering
the discrepancy over the intersection corresponding to the entire set of inputs. Hence, we may
assume $|\mu(X^+) - \mu(X^-)| = o(1/n)$. Thus, $\mu(X^-) \ge 1/2 - o(1/n)$. However, $X^-$ can be covered
by the following $n$ monochromatic cylinder intersections: let $C_i$ be the set of inputs in which the
$i$th column is an all-one column. Then $X^- = \cup_{i=1}^{n} C_i$. By averaging, there exists an $i$ such that
$\mu(C_i) \ge 1/2n - o(1/n^2)$. Taking the discrepancy of this $C_i$, we are done. □

It is therefore impossible to obtain better than $O(\log n)$ bounds on the communication
complexity of Disjointness by a direct application of the discrepancy method. In fact, the above argument
shows that Theorem 2.1 fails to give better than polylogarithmic lower bounds for every function
that is in $\mathrm{NP}_k^{CC}$ or $\mathrm{co\text{-}NP}_k^{CC}$.

Sherstov [16, Sec 2.4] provides a nice reinterpretation of Razborov's discrepancy method for
two-party quantum communication complexity by pointing out the following: in order to prove
a lower bound on the communication complexity of a function $f$ in any bounded error model, it
is sufficient to find a function $g$ that correlates well with $f$ under some distribution but has large
communication complexity. Based on this observation, we modify the discrepancy method to the
following:

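Lemma 3.2 (Generalized Discrepancy Method). Denote $X = Y_1 \times \cdots \times Y_k$. Let $f : X \to \{-1,1\}$
and $g : X \to \{-1,1\}$ be such that under some distribution $\mu$ we have $\mathrm{Corr}_{\mu}(f,g) \ge \delta$. Then

$$R_k^{\epsilon}(f) \ge \log\left(\frac{\delta + 2\epsilon - 1}{\mathrm{disc}_{k,\mu}(g)}\right) \tag{6}$$

Proof. Let $\mathcal{P}$ be a $k$-party randomized protocol that computes $f$ with advantage $\epsilon$ and cost $c$.
Then for every distribution $\mu$ over the inputs, we can derive a deterministic $k$-player protocol $\mathcal{P}'$
for $f$ that errs on at most a $1/2 - \epsilon$ fraction of the inputs (w.r.t. $\mu$) and has cost $c$. Take $\mu$
to be a distribution satisfying the correlation inequality. We know $\mathcal{P}'$ partitions the input space
into at most $2^c$ monochromatic (w.r.t. $\mathcal{P}'$) cylinder intersections. Let $\mathcal{C}$ denote this set of cylinder
intersections. Then,

$$\delta \le \Big| \mathbb{E}_{x \sim \mu} f(x) g(x) \Big| = \Big| \sum_{x} f(x) g(x) \mu(x) \Big| \le \Big| \sum_{x} \mathcal{P}'(x) g(x) \mu(x) \Big| + \Big| \sum_{x} \big(f(x) - \mathcal{P}'(x)\big) g(x) \mu(x) \Big|.$$

Since $\mathcal{P}'$ is constant over every cylinder intersection $S$ in $\mathcal{C}$, we have

$$\delta \le \sum_{S \in \mathcal{C}} \Big| \sum_{x \in S} \mathcal{P}'(x) g(x) \mu(x) \Big| + \sum_{x} |g(x)|\,|f(x) - \mathcal{P}'(x)|\,\mu(x)$$

$$\le \sum_{S \in \mathcal{C}} \Big| \sum_{x \in S} g(x) \mu(x) \Big| + \sum_{x} |f(x) - \mathcal{P}'(x)|\,\mu(x)$$

$$\le 2^c\, \mathrm{disc}_{k,\mu}(g) + 2(1/2 - \epsilon).$$

This immediately gives (6). □

Observe that when $f = g$, i.e. $\mathrm{Corr}_{\mu}(f,g) = 1$, we get the classical discrepancy method
(Theorem 2.1).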
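The arithmetic of the generalized bound is simple enough to sanity-check numerically. The sketch below assumes the bound has the form $\log\big((\delta + 2\epsilon - 1)/\mathrm{disc}_{k,\mu}(g)\big)$, which is what the final inequality of the proof yields after rearranging, and it assumes logarithms to base 2 since cost is counted in bits. The function names are hypothetical.

```python
import math

def generalized_discrepancy_bound(delta: float, eps: float, disc_g: float) -> float:
    """Lower bound log2((delta + 2*eps - 1) / disc(g)) on the cost of computing f."""
    gain = delta + 2 * eps - 1
    if gain <= 0 or disc_g <= 0:
        raise ValueError("the bound is only meaningful when delta + 2*eps > 1 and disc > 0")
    return math.log2(gain / disc_g)

def classical_discrepancy_bound(eps: float, disc_f: float) -> float:
    """Lower bound log2(2*eps / disc(f)) of the classical discrepancy method."""
    return math.log2(2 * eps / disc_f)
```

Setting `delta = 1.0` recovers the classical expression, matching the observation that $f = g$ gives back Theorem 2.1. The bound is vacuous unless $\delta + 2\epsilon > 1$, which is why a correlation of $2/3$ only helps for advantage $\epsilon > 1/6$.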



4 Generating Functions With Low Discrepancy 
4.1 Masking Schemes 

We have already defined one masking scheme through the notation $x \Leftarrow y^1,\ldots,y^{k-1}$. This allowed us
to define $G_k^g$ for a base function $g$. Well-known functions such as $\mathrm{GIP}_k$ and $\mathrm{DISJ}_k$ are representable
in this notation by $G_k^{\mathrm{PARITY}}$ and $G_k^{\mathrm{NOR}}$ respectively. We now define a second masking scheme which
plays a crucial role in lower bounding the communication complexity of $G_k^g$. This masking scheme is
obtained by first slightly simplifying the pattern matrices in [16] and then generalizing the simplified
matrices to higher dimension for dealing with multiple players.

Figure 2: Illustration of the masking scheme $x \leftarrow S^1, S^2$. The parameters are $\ell = 3$, $m = 3$, $n = 27$.

Let $S^1,\ldots,S^{k-1} \in [\ell]^m$ for some positive $\ell$ and $m$. Let $x \in \{0,1\}^n$ where $n = \ell^{k-1} m$. Here it is
convenient to think of $x$ as divided into $m$ equal blocks, where each block is a $(k-1)$-dimensional
array with each dimension having size $\ell$. Each $S^i$ is a vector of length $m$ with each coordinate being
an element from $\{1,\ldots,\ell\}$. The $k-1$ vectors $S^1,\ldots,S^{k-1}$ jointly unmask $m$ bits of $x$, denoted by
$x \leftarrow S^1,\ldots,S^{k-1}$, precisely one from each block of $x$, i.e.

$$x[1]\big[S^1[1], S^2[1], \ldots, S^{k-1}[1]\big],\ \ldots,\ x[m]\big[S^1[m], S^2[m], \ldots, S^{k-1}[m]\big]$$

where $x[i]$ refers to the $i$th block of $x$. See Figure 2 for an illustration of this masking scheme.
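The unmasking operation can be made concrete as follows. This Python sketch fixes one particular memory layout as an assumption: $x$ is stored as a flat list of length $n$, block by block, and within a block the $(k-1)$-dimensional array is stored with the first coordinate most significant. The function names are hypothetical.

```python
def flat_index(alpha, coords, ell):
    """Position in the flat string x of entry `coords` of block `alpha`.

    alpha is a 0-based block number; coords are 1-based, each in 1..ell.
    """
    offset = 0
    for c in coords:
        offset = offset * ell + (c - 1)
    return alpha * ell ** len(coords) + offset

def unmask(x, S, ell):
    """Return x <- S^1,...,S^{k-1}: exactly one bit from each of the m blocks."""
    m = len(S[0])
    return [x[flat_index(a, [Si[a] for Si in S], ell)] for a in range(m)]

# Parameters of the illustration: ell = 3, m = 3, k = 3, hence n = 27.
x_example = [j % 2 for j in range(27)]
S_example = [[1, 2, 3], [3, 1, 2]]
```

With these example values the selected positions are 2, 12 and 25, one in each block of nine bits.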

For a given base function $f : \{0,1\}^m \to \{-1,1\}$, we define $F_k^f : \{0,1\}^n \times ([\ell]^m)^{k-1} \to \{-1,1\}$
as $F_k^f(x, S^1,\ldots,S^{k-1}) = f(x \leftarrow S^1,\ldots,S^{k-1})$.

Lemma 4.1. If $f : \{0,1\}^m \to \{-1,1\}$ and $f' : \{0,1\}^n \to \{-1,1\}$ have the property that $f(z) =
f'(z0^{n-m})$ (here $n = \ell^{k-1} m$ as described in the construction of $F_k^f$), then

$$R_k^{\epsilon}(F_k^f) \le R_k^{\epsilon}(G_k^{f'}). \tag{7}$$

Proof Sketch. Observe that there are functions $\Gamma_i : [\ell]^m \to \{0,1\}^n$ such that $F_k^f(x, S^1,\ldots,S^{k-1}) =
G_k^{f'}(x, \Gamma_1(S^1),\ldots,\Gamma_{k-1}(S^{k-1}))$ for all $x, S^1,\ldots,S^{k-1}$. Therefore the players can privately convert
their inputs and apply the protocol for $G_k^{f'}$. □

Note that the proof shows (7) holds not just for randomized but for any model of communication.
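One concrete choice of the conversion maps can be verified exhaustively for small parameters. The sketch below is an assumption-laden illustration, not necessarily the authors' encoding: $\Gamma_i(S^i)$ is taken to be the $n$-bit string with a 1 at exactly those positions of block $\alpha$ whose $i$th coordinate equals $S^i[\alpha]$, and $x$ is laid out block by block with the first coordinate most significant. All function names are hypothetical.

```python
from itertools import product

def flat_index(alpha, coords, ell):
    """Position in x of entry `coords` (1-based) of block `alpha` (0-based)."""
    offset = 0
    for c in coords:
        offset = offset * ell + (c - 1)
    return alpha * ell ** len(coords) + offset

def unmask(x, S, ell):
    """x <- S^1,...,S^{k-1}: one bit per block."""
    return [x[flat_index(a, [Si[a] for Si in S], ell)] for a in range(len(S[0]))]

def select_all_one_columns(x, ys):
    """x <= y^1,...,y^{k-1}: bits of x at all-one columns, zero-padded to length n."""
    picked = [x[i] for i in range(len(x)) if all(y[i] == 1 for y in ys)]
    return picked + [0] * (len(x) - len(picked))

def gamma(i, Si, ell, k):
    """Assumed encoding Gamma_i: mark positions whose i-th coordinate matches S^i."""
    out = []
    for a in range(len(Si)):
        for coords in product(range(1, ell + 1), repeat=k - 1):
            out.append(1 if coords[i] == Si[a] else 0)
    return out

def check_reduction(ell=2, m=2, k=3):
    """Check (x <= Gamma_1(S^1),...) == (x <- S^1,...) 0^{n-m} for every input."""
    n = ell ** (k - 1) * m
    vectors = list(product(range(1, ell + 1), repeat=m))
    for x in product((0, 1), repeat=n):
        for S in product(vectors, repeat=k - 1):
            ys = [gamma(i, S[i], ell, k) for i in range(k - 1)]
            if select_all_one_columns(x, ys) != unmask(x, S, ell) + [0] * (n - m):
                return False
    return True
```

Under this encoding each block contains exactly one all-one column, namely the position indexed jointly by $S^1,\ldots,S^{k-1}$, so the selected string is the unmasked string followed by $n-m$ zeros, and $f(z) = f'(z0^{n-m})$ finishes the argument.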
4.2 Orthogonality and Discrepancy 

Now we prove that if the base function $f$ in our masking scheme has a certain nice property, then
the masked function $F_k^f$ has small discrepancy. To describe the nice property, let us define the
following: for a distribution $\mu$ on the inputs, $f$ is $(\mu,d)$-orthogonal if $\mathbb{E}_{x \sim \mu} f(x) \chi_S(x) = 0$ for all
$|S| < d$. Then,

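Lemma 4.2 (Orthogonality-Discrepancy Lemma). Let $f : \{-1,1\}^m \to \{-1,1\}$ be any
$(\mu,d)$-orthogonal function for some distribution $\mu$ on $\{-1,1\}^m$ and some integer $d > 0$. Derive the
probability distribution $\lambda$ on $\{-1,1\}^n \times ([\ell]^m)^{k-1}$ from $\mu$ as follows:
$\lambda(x, S^1,\ldots,S^{k-1}) = \frac{\mu(x \leftarrow S^1,\ldots,S^{k-1})}{\ell^{m(k-1)} 2^{n-m}}$.
Then,

$$\left(\mathrm{disc}_{k,\lambda}(F_k^f)\right)^{2^{k-1}} \le \sum_{j=d}^{(k-1)m} \binom{(k-1)m}{j} \left(\frac{2^{2^{k-1}-1}}{\ell-1}\right)^{j} \tag{8}$$

Hence, for $\ell - 1 \ge \frac{2^{2^k}(k-1)em}{d}$ and $d > 2$,

$$\mathrm{disc}_{k,\lambda}(F_k^f) \le \frac{1}{2^{d/2^{k-1}}} \tag{9}$$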



Remark. The Lemma above appears very similar to the Multiparty Degree-Discrepancy Lemma in
[8] that is an extension of the two party Degree-Discrepancy Theorem of [17]. There, the magic
property on the base function is high voting degree. It is worth noting that $(\mu,d)$-orthogonality
of $f$ is equivalent to voting degree of $f$ being at least $d$. Indeed the proof of the above Lemma is
almost identical to the proof of the Degree-Discrepancy Lemma save for the minor details of the
difference between our masking scheme and the one used in [8].

Proof of Lemma 4.2. The starting point is to write the expression for discrepancy w.r.t. an
arbitrary cylinder intersection $\phi$:

$$\mathrm{disc}_k^{\phi}(F_k^f) = \Big| \sum_{x,S^1,\ldots,S^{k-1}} F_k^f(x,S^1,\ldots,S^{k-1})\, \phi(x,S^1,\ldots,S^{k-1})\, \lambda(x,S^1,\ldots,S^{k-1}) \Big| \tag{10}$$

This changes to the more convenient expected value notation as follows:

$$\mathrm{disc}_k^{\phi}(F_k^f) = 2^m \Big| \mathbb{E}_{x,S^1,\ldots,S^{k-1}}\, F_k^f(x,S^1,\ldots,S^{k-1})\, \phi(x,S^1,\ldots,S^{k-1})\, \mu(x \leftarrow S^1,\ldots,S^{k-1}) \Big| \tag{11}$$

where $(x,S^1,\ldots,S^{k-1})$ is now uniformly distributed over $\{0,1\}^{\ell^{k-1}m} \times ([\ell]^m)^{k-1}$. Then, we use
the trick of repeatedly combining the triangle inequality with Cauchy-Schwarz exactly as done in
Chattopadhyay [8] (or even before by Raz [14]) to obtain the following:

$$\left(\mathrm{disc}_k^{\phi}(F_k^f)\right)^{2^{k-1}} \le 2^{2^{k-1}m}\, \mathbb{E}_{S_0^1,S_1^1,\ldots,S_0^{k-1},S_1^{k-1}}\, H_k^f\big(S_0^1,S_1^1,\ldots,S_0^{k-1},S_1^{k-1}\big) \tag{12}$$

where

$$H_k^f\big(S_0^1,S_1^1,\ldots,S_0^{k-1},S_1^{k-1}\big) = \Big| \mathbb{E}_{x \in \{0,1\}^{\ell^{k-1}m}} \prod_{u \in \{0,1\}^{k-1}} f\big(x \leftarrow S_{u_1}^1,\ldots,S_{u_{k-1}}^{k-1}\big)\, \mu\big(x \leftarrow S_{u_1}^1,\ldots,S_{u_{k-1}}^{k-1}\big) \Big| \tag{13}$$

We look at a fixed $S_0^i, S_1^i$, for $i = 1,\ldots,k-1$. Let $r_i = |S_0^i \cap S_1^i|$ and $r = \sum_i r_i$ for $1 \le i \le k-1$.
We now make two claims that are analogous to Claim 15 and Claim 16 respectively in [8].



Claim 4.3.

$$H_k^f\big(S_0^1,S_1^1,\ldots,S_0^{k-1},S_1^{k-1}\big) \le \frac{2^{(2^{k-1}-1)r}}{2^{2^{k-1}m}} \tag{14}$$

Claim 4.4. Let $r < d$. Then,

$$H_k^f\big(S_0^1,S_1^1,\ldots,S_0^{k-1},S_1^{k-1}\big) = 0 \tag{15}$$



We prove these claims in the next section. Claim 4.3 simply follows from the fact that $\mu$ is
a probability distribution and $f$ is $1/-1$ valued, while Claim 4.4 uses the $(\mu,d)$-orthogonality of $f$.
We now continue with the proof of the Orthogonality-Discrepancy Lemma assuming these claims.
Applying them, we obtain

$$\left(\mathrm{disc}_k^{\phi}(F_k^f)\right)^{2^{k-1}} \le \sum_{j=d}^{(k-1)m} 2^{(2^{k-1}-1)j} \sum_{j_1+\cdots+j_{k-1}=j} \Pr\big[r_1 = j_1 \wedge \cdots \wedge r_{k-1} = j_{k-1}\big] \tag{16}$$

Substituting the value of the probability, we further obtain:

$$\left(\mathrm{disc}_k^{\phi}(F_k^f)\right)^{2^{k-1}} \le \sum_{j=d}^{(k-1)m} 2^{(2^{k-1}-1)j} \sum_{j_1+\cdots+j_{k-1}=j} \binom{m}{j_1} \cdots \binom{m}{j_{k-1}} \frac{(\ell-1)^{m-j_1} \cdots (\ell-1)^{m-j_{k-1}}}{\ell^{(k-1)m}} \tag{17}$$

The following simple combinatorial identity is well known:

$$\sum_{j_1+\cdots+j_{k-1}=j} \binom{m}{j_1} \cdots \binom{m}{j_{k-1}} = \binom{(k-1)m}{j}$$

Plugging this identity into (17) immediately yields (8) of the Orthogonality-Discrepancy Lemma.
Recalling $\binom{(k-1)m}{j} \le \left(\frac{e(k-1)m}{j}\right)^{j}$, and choosing $\ell - 1 \ge 2^{2^k}(k-1)em/d$, we get (9). □
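The last step, from the sum over $j \ge d$ to the closed-form discrepancy bound, can be checked numerically for concrete parameters. The Python sketch below rests on assumptions about the garbled formulas: it takes the right-hand side of the lemma's first bound to be $\sum_{j=d}^{(k-1)m} \binom{(k-1)m}{j} \big(2^{2^{k-1}-1}/(\ell-1)\big)^j$ and the final bound to be $2^{-d/2^{k-1}}$, which after raising to the power $2^{k-1}$ amounts to comparing the sum with $2^{-d}$. Function names are hypothetical.

```python
import math
from fractions import Fraction
from itertools import product

def rhs_of_sum_bound(k, m, d, ell):
    """Sum over j = d..(k-1)m of C((k-1)m, j) * (2^(2^(k-1)-1) / (ell-1))^j, exactly."""
    base = Fraction(2 ** (2 ** (k - 1) - 1), ell - 1)
    top = (k - 1) * m
    return sum(math.comb(top, j) * base ** j for j in range(d, top + 1))

def smallest_ell(k, m, d):
    """Smallest integer ell with ell - 1 >= 2^(2^k) * (k-1) * e * m / d."""
    return math.ceil(2 ** (2 ** k) * (k - 1) * math.e * m / d) + 1

def vandermonde_holds(m, parts, j):
    """Check sum over j_1+...+j_parts = j of prod C(m, j_i) == C(parts*m, j)."""
    lhs = sum(
        math.prod(math.comb(m, ji) for ji in js)
        for js in product(range(m + 1), repeat=parts)
        if sum(js) == j
    )
    return lhs == math.comb(parts * m, j)
```

For instance, with $k = 3$, $m = 16$, $d = 4$ the helper `smallest_ell` returns a value of a few thousand, and the exact sum is far below $2^{-4}$, in line with the geometric decay used in the proof.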

4.3 Proofs of Claims 

We identify the set of all assignments to boolean variables in $X = \{x_1,\ldots,x_n\}$ with the $n$-ary
boolean cube $\{0,1\}^n$. For any $u \in \{0,1\}^{k-1}$, let $Z_u$ represent the set of $m$ variables indexed jointly
by $S_{u_1}^1,\ldots,S_{u_{k-1}}^{k-1}$. There is precisely one variable chosen from each block of $X$. Denote by $Z_u[\alpha]$
the unique variable in $Z_u$ that is in the $\alpha$th block of $X$, for each $1 \le \alpha \le m$. Let $Z = \cup_u Z_u$. We
abuse notation for the sake of clarity and use $Z_u$ in the context of expected value calculations to
also mean a uniformly chosen random assignment to the variables in the set $Z_u$.



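Proof of Claim 4.4.

$$H_k^f\big(S_0^1,S_1^1,\ldots,S_0^{k-1},S_1^{k-1}\big) = \Big| \mathbb{E}_{Z_{0^{k-1}}} \Big[ f(Z_{0^{k-1}})\, \mu(Z_{0^{k-1}})\; \mathbb{E}_{X - Z_{0^{k-1}}} \prod_{u \in \{0,1\}^{k-1},\, u \neq 0^{k-1}} f(Z_u)\, \mu(Z_u) \Big] \Big| \tag{18}$$

Observe that for any block $\alpha$ and any $u \neq 0^{k-1}$, $Z_u[\alpha] = Z_{0^{k-1}}[\alpha]$ iff for each $i$ such that $u_i = 1$,
$S_0^i[\alpha] = S_1^i[\alpha]$. Recall that $r_i$ is the number of indices $\alpha$ such that $S_0^i[\alpha] = S_1^i[\alpha]$. Therefore, there
are at most $r = \sum_{i=1}^{k-1} r_i$ many indices $\alpha$ such that $Z_u[\alpha] = Z_{0^{k-1}}[\alpha]$ for some $u \neq 0^{k-1}$. This
means the inner expectation in (18) is a function that depends on at most $r$ variables. Since $f$
is orthogonal under $\mu$ to every polynomial of degree less than $d$ and $r < d$, we get the desired
result. □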
Proof of Claim 4.3. Observe that since $f$ is $1/-1$ valued, we get the following:

$$H_k^f\big(S_0^1,S_1^1,\ldots,S_0^{k-1},S_1^{k-1}\big) \le \mathbb{E}_{X} \prod_{u \in \{0,1\}^{k-1}} \mu(Z_u)$$

$$= \mathbb{E}_{X-Z}\, \mathbb{E}_{Z} \prod_{u \in \{0,1\}^{k-1}} \mu(Z_u)$$

$$= \mathbb{E}_{X-Z}\, \frac{1}{2^{|Z|}} \sum_{z \in \{0,1\}^{|Z|}} \prod_{u \in \{0,1\}^{k-1}} \mu(Z_u) \tag{19}$$

$$\le \mathbb{E}_{X-Z}\, \frac{1}{2^{|Z|}} \sum_{y_1,\ldots,y_{2^{k-1}} \in \{0,1\}^m} \prod_{i=1}^{2^{k-1}} \mu(y_i) \tag{20}$$

where the last inequality holds because every product in the inner sum of (19) appears in the inner
sum of (20). Using the fact that $\mu$ is a probability distribution, we get:

$$\text{RHS of (20)} = \mathbb{E}_{X-Z}\, \frac{1}{2^{|Z|}} \prod_{i=1}^{2^{k-1}} \sum_{y_i \in \{0,1\}^m} \mu(y_i) = \frac{1}{2^{|Z|}}$$

We now find a lower bound on $|Z|$. Let $t_u$ denote the Hamming weight of the string $u$ and
$\{j_1,\ldots,j_{t_u}\}$ denote the set of indices in $[k-1]$ at which $u$ has a 1. Define

$$Y_u = \big\{ Z_u[\alpha] \mid S_1^{j_s}[\alpha] \neq S_0^{j_s}[\alpha];\ 1 \le s \le t_u,\ 1 \le \alpha \le m \big\} \tag{21}$$

The following follow from the above definition.

• $|Y_{0^{k-1}}| = m$ and $|Y_u| \ge m - \sum_{1 \le s \le t_u} r_{j_s} \ge m - r$ for all $u \neq 0^{k-1}$.

• $Y_u \cap Y_v = \emptyset$ for $u \neq v$. This follows from the following argument: wlog assume there is an
index $\beta$ where $u$ has a one but $v$ has a zero. Consider any block $\alpha$ such that $Z_u[\alpha]$ is in $Y_u$.
It must be true that $S_1^{\beta}[\alpha] \neq S_0^{\beta}[\alpha]$. This means that $Z_u[\alpha] \neq Z_v[\alpha]$. Therefore $Z_u[\alpha]$ is not
in $Y_v$ and we are done.

• $Y := \cup_{u \in \{0,1\}^{k-1}} Y_u = Z$. This is because if $Z_u[\alpha]$ is not in $Y_u$ then there are indices $j_1,\ldots,j_s$
where $u$ contains a one and $S_0^{j_l}[\alpha] = S_1^{j_l}[\alpha]$. Let $v$ be the string that contains a zero at positions
$j_1,\ldots,j_s$ and at other positions agrees with $u$. Then by definition, $Z_u[\alpha] = Z_v[\alpha] \in Y_v$.

Thus, $|Z| = |Y| = \sum_u |Y_u| \ge m + \sum_{u \neq 0^{k-1}} (m - r) = 2^{k-1}m - (2^{k-1}-1)r$ and the result follows. □
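The counting bound on $|Z|$ can be confirmed by exhaustive enumeration for small parameters. In the Python sketch below, which is an illustration with hypothetical function names, a variable is identified by the pair (block, coordinate tuple), $Z$ is the set of all variables touched by some $u \in \{0,1\}^{k-1}$, and $r$ counts the agreements between $S_0^i$ and $S_1^i$ over all $i$ and all blocks.

```python
from itertools import product

def z_size_and_r(S0, S1):
    """Return (|Z|, r) for lists S0, S1 of k-1 vectors in [ell]^m."""
    k1, m = len(S0), len(S0[0])
    r = sum(S0[i][a] == S1[i][a] for i in range(k1) for a in range(m))
    Z = set()
    for u in product((0, 1), repeat=k1):
        for a in range(m):
            coords = tuple((S1[i] if u[i] else S0[i])[a] for i in range(k1))
            Z.add((a, coords))  # the variable Z_u[a] inside block a
    return len(Z), r

def bound_holds(S0, S1):
    """Check |Z| >= 2^(k-1) * m - (2^(k-1) - 1) * r."""
    k1, m = len(S0), len(S0[0])
    size, r = z_size_and_r(S0, S1)
    return size >= 2 ** k1 * m - (2 ** k1 - 1) * r

def all_vector_lists(ell, m, k1):
    """Every list of k1 vectors, each in [ell]^m."""
    vectors = list(product(range(1, ell + 1), repeat=m))
    return list(product(vectors, repeat=k1))
```

When the two families disagree everywhere ($r = 0$) the bound is tight: every one of the $2^{k-1}$ strings $u$ contributes $m$ fresh variables.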

5 The Main Result 

Before proving the main result, we borrow from Sherstov [16] a beautiful duality between
approximability and orthogonality. The intuition is that if a function is at a large distance from the linear
space spanned by the characters of degree less than $d$, then its projection on the dual space spanned
by characters of degree at least $d$ is large. More precisely,

Lemma 5.1. Let $f : \{-1,1\}^m \to \mathbb{R}$ be given with $\deg_{\delta}(f) = d \ge 1$. Then there exists $g :
\{-1,1\}^m \to \{-1,1\}$ and a distribution $\mu$ on $\{-1,1\}^m$ such that $g$ is $(\mu,d)$-orthogonal and
$\mathrm{Corr}_{\mu}(f,g) \ge \delta$.

We do not prove this Lemma but the interested reader can read its short proof in [16], which is
based on an application of linear programming duality.

Theorem 5.2 (Main Theorem). Let $f : \{0,1\}^m \to \{-1,1\}$ have $\delta$-approximate degree $d$. Let
$n \ge \left(\frac{2^{2^k}(k-1)e}{d}\right)^{k-1} m^k$, and $f' : \{0,1\}^n \to \{-1,1\}$ be such that $f(z) = f'(z0^{n-m})$. Then

$$R_k^{\epsilon}(G_k^{f'}) \ge \frac{d}{2^{k-1}} + \log(\delta + 2\epsilon - 1). \tag{22}$$

Proof. Applying Lemma 5.1 we obtain a function $g$ and a distribution $\mu$ such that $\mathrm{Corr}_{\mu}(f,g) \ge \delta$
and $\mathbb{E}_{x \sim \mu}\, g(x) \chi_S(x) = 0$ for $|S| < d$. These $g$ and $\mu$ satisfy the conditions of Lemma 4.2, therefore
we have

$$\mathrm{disc}_{k,\lambda}(F_k^g) \le \frac{1}{2^{d/2^{k-1}}} \tag{23}$$

where $\lambda$ is obtained from $\mu$ as stated in Lemma 4.2 and $\ell \ge 2^{2^k}(k-1)em/d$. Since $n = \ell^{k-1} m$,
(23) holds for $n \ge \left(\frac{2^{2^k}(k-1)e}{d}\right)^{k-1} m^k$.

It can be easily verified that $\mathrm{Corr}_{\lambda}(F_k^f, F_k^g) = \mathrm{Corr}_{\mu}(f,g) \ge \delta$. Thus, by plugging the value of
$\mathrm{disc}_{k,\lambda}(F_k^g)$ in (6) of the generalized discrepancy method we get

$$R_k^{\epsilon}(F_k^f) \ge \frac{d}{2^{k-1}} + \log(\delta + 2\epsilon - 1).$$

The desired result is obtained by applying Lemma 4.1. □
5.1 Disjointness Separates $\mathrm{BPP}_k^{CC}$ and $\mathrm{NP}_k^{CC}$

As a corollary to our main theorem, we obtain the following lower bound for the Disjointness
function.

Corollary 5.3.

$$R_k^{\epsilon}(\mathrm{DISJ}_k) = \Omega\left(\frac{n^{1/(k+1)}}{2^{2^k}(k-1)2^{k-1}}\right)$$

for any constant $\epsilon > 0$.

Proof. Let $f = \mathrm{NOR}_m$ and $f' = \mathrm{NOR}_n$. We know $\deg_{1/3}(\mathrm{NOR}_m) = \Theta(\sqrt{m})$ by Theorem 2.2.
Setting $n = \left(\frac{2^{2^k}(k-1)e}{\deg_{1/3}(\mathrm{NOR}_m)}\right)^{k-1} m^k$ and writing (22) in terms of $n$ gives the result for any constant
$\epsilon > 1/6$. The bound can be made to work for every constant $\epsilon$ by a standard boosting argument. □
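The change of variables from $m$ to $n$ in this proof can be traced numerically. The Python sketch below is only an illustration under explicit assumptions: it replaces the $\Theta(\sqrt{m})$ approximate degree by exactly $\sqrt{m}$ (ignoring the hidden constant), uses base-2 logarithms, and relies on hypothetical function names.

```python
import math

def required_n(k, m, d):
    """n = (2^(2^k) * (k-1) * e / d)^(k-1) * m^k, the input length used in the proof."""
    return (2 ** (2 ** k) * (k - 1) * math.e / d) ** (k - 1) * m ** k

def main_theorem_bound(k, d, delta, eps):
    """d / 2^(k-1) + log2(delta + 2*eps - 1), the right-hand side of the main theorem."""
    return d / 2 ** (k - 1) + math.log2(delta + 2 * eps - 1)

def corollary_expression(n, k):
    """n^(1/(k+1)) / (2^(2^k) * (k-1) * 2^(k-1)), the expression inside the Omega."""
    return n ** (1 / (k + 1)) / (2 ** (2 ** k) * (k - 1) * 2 ** (k - 1))

# Example: k = 3 players and m = 10^6, with the degree idealised as sqrt(m) = 1000.
k, m = 3, 10 ** 6
d = math.sqrt(m)
n = required_n(k, m, d)
```

With $d = \sqrt{m}$ one gets $n = (2^{2^k}(k-1)e)^{k-1} m^{(k+1)/2}$, so $d$ equals $n^{1/(k+1)}$ divided by $(2^{2^k}(k-1)e)^{(k-1)/(k+1)}$, and the leading term $d/2^{k-1}$ dominates the expression inside the $\Omega$.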

Observe that we get the same bound for the function $G_k^{\mathrm{OR}}$. It is not difficult to see that
there is an $O(\log n)$ bit non-deterministic protocol for $G_k^{\mathrm{OR}}$ and therefore this function separates the
communication complexity classes $\mathrm{BPP}_k^{CC}$ and $\mathrm{NP}_k^{CC}$ for all $k = o(\log\log n)$.
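One natural way to realise such a non-deterministic protocol is sketched below in Python. It is an assumed, standard construction rather than one spelled out in the text: the proof string is a column index $i$ (about $\log n$ bits), each player checks that every row it can see has a 1 in column $i$, and the protocol accepts when all $k$ players agree. The function names are hypothetical, and the rows are the $k$ inputs $x, y^1,\ldots,y^{k-1}$.

```python
from itertools import product

def player_accepts(j, rows, i):
    """Player j sees every row except row j and checks column i on those rows."""
    return all(row[i] == 1 for idx, row in enumerate(rows) if idx != j)

def verify(rows, i):
    """Accept the certificate i iff every player accepts (one extra bit per player)."""
    return all(player_accepts(j, rows, i) for j in range(len(rows)))

def has_all_one_column(rows):
    """Ground truth: some column is 1 in every row, i.e. the inputs are not disjoint."""
    return any(all(row[i] == 1 for row in rows) for i in range(len(rows[0])))

def all_inputs(k, n):
    """Enumerate every k x n boolean input as a list of k rows."""
    for bits in product((0, 1), repeat=k * n):
        yield [bits[j * n:(j + 1) * n] for j in range(k)]
```

Because $k \ge 2$, every row is seen by at least one player, so a certificate is accepted by all players exactly when column $i$ is all-one. The total cost in this sketch is about $\log n + k$ bits.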






5.2 Other Symmetric Functions 

Theorem 5.2 does not immediately provide strong bounds on the communication complexity of
$G_k^f$ for every symmetric $f$. For instance, if $f$ is the MAJORITY function then one has to work a little
more to derive strong lower bounds.

In this section, using the main result and Paturi's Theorem (Theorem 2.2), we obtain a lower
bound on the communication complexity of $G_k^f$ for each symmetric $f$. Let $f : \{0,1\}^n \to \{1,-1\}$ be
the symmetric function induced from a predicate $D : \{0,1,\ldots,n\} \to \{1,-1\}$. We denote by $G_k^D$
the function $G_k^f$. For $t \in \{0,1,\ldots,n-1\}$, define $D_t : \{0,1,\ldots,n-t\} \to \{1,-1\}$ by $D_t(i) = D(i+t)$.
Observe that the communication complexity of $G_k^D$ is at least the communication complexity of
$G_k^{D_t}$.

Corollary 5.4. Let $D : \{0,1,\ldots,n\} \to \{1,-1\}$ be any predicate with $\deg_{1/3}(D) = d$. Let $\ell_0 = \ell_0(D)$ and
$\ell_1 = \ell_1(D)$. Define $T : \mathbb{N} \to \mathbb{N}$ by



T(n) 

Then for any constant e > 0, 



n 



(2 2 "(k-l)e/d) 



k-1 



ffi.(Gj?) = ft(*(4i) + ^ 



2 

where 

Proof. There are three cases to consider. 

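Case 1: Suppose $\ell_0 \le T(n)/2$. Let $D' : \{0,1,\ldots,T(n)\} \to \{1,-1\}$ be such that for any $z \in
\{0,1\}^{T(n)}$, we have $D(|z|) = D'(|z|)$. By Theorem 5.2 the complexity of $G_k^D$ is $\Omega(d/2^{k-1})$ where
$d = \deg_{1/3}(D')$. By Paturi's Theorem, $\deg_{1/3}(D') \ge \sqrt{T(n)\ell_0(D')} = \sqrt{T(n)\ell_0}$ and so

$$R_k^{\epsilon}(G_k^D) = \Omega\left(\frac{\sqrt{T(n)\ell_0}}{2^{k-1}}\right).$$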



Case 2: Suppose $T(n)/2 < \ell_0 \le n/2$. We find a lower bound on the communication complexity of
$G_k^{D_t}$ where $t = \ell_0 - T(n-\ell_0)/2$. Let $D'_t : \{0,1,\ldots,T(n-\ell_0)\} \to \{1,-1\}$ be such that $D'_t(|z|) =
D_t(|z|)$. So by Theorem 5.2 the complexity of $G_k^{D_t}$ is $\Omega(d/2^{k-1})$ where $d$ is the approximation
degree of $D'_t$. We know

$$D'_t(T(n-\ell_0)/2) = D_t(T(n-\ell_0)/2)$$

$$= D(T(n-\ell_0)/2 + \ell_0 - T(n-\ell_0)/2)$$

$$= D(\ell_0)$$

$$\neq D(\ell_0 - 1)$$

$$= D'_t(T(n-\ell_0)/2 - 1).$$

Thus by Paturi's Theorem, $\deg_{1/3}(D'_t) \ge \sqrt{T(n-\ell_0)^2/2}$. This implies

$$R_k^{\epsilon}(G_k^D) = \Omega\left(\frac{T(n-\ell_0)}{2^{k-1}}\right).$$
Case 3: Suppose $\ell_0 = 0$ and $\ell_1 \neq 0$. The argument is similar to the one for Case 2. Consider $D_t$
where $t = n - \ell_1 - T(\ell_1)/2$. Let $D'_t : \{0,1,\ldots,T(\ell_1)\} \to \{1,-1\}$ be such that $D'_t(|z|) = D_t(|z|)$.
As in Case 2, one sees that $D'_t(T(\ell_1)/2) \neq D'_t(T(\ell_1)/2 + 1)$, so $\deg_{1/3}(D'_t) \ge \sqrt{T(\ell_1)^2/2}$. Therefore,

$$R_k^{\epsilon}(G_k^D) = \Omega\left(\frac{T(\ell_1)}{2^{k-1}}\right).$$



Combining these three cases, we get the desired result. 



□ 